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ABSTRACT 

The estimation of cosmological constraints from observations of the large scale struc¬ 
ture of the Universe, such as the power spectrum or the correlation function, requires 
the knowledge of the inverse of the associated covariance matrix, namely the precision 
matrix, \F. In most analyses, T' is estimated from a limited set of mock catalogues. 
Depending on how many mocks are used, this estimation has an associated error which 
must be propagated into the final cosmological constraints. For future surveys such 
as Euclid and DESI, the control of this additional uncertainty requires a prohibitively 
large number of mock catalogues. In this work we test a novel technique for the es¬ 
timation of the precision matrix, the covariance tapering method, in the context of 
baryon acoustic oscillation measurements. Even though this technique was originally 
devised as a way to speed up maximum likelihood estimations, our results show that 
it also reduces the impact of noisy precision matrix estimates on the derived confi¬ 
dence intervals, without introducing biases on the target parameters. The application 
of this technique can help future surveys to reach their true constraining power using 
a significantly smaller number of mock catalogues. 

Key words: large-scale structure of the Universe - methods: data analysis, observa¬ 
tional, statistics 


1 INTRODUCTION 

The small statistical uncertainties associated with current 
cosmological observations allow for precise cosmological 
constraints to be derived (e.g. Anderson et al.|2014 Planck 
[Collaboration et al.|2015| . Future stage IV experiments such 
as Euclid ( Laureijs et al.||2011| and the Dark Energy Spec- 
troscopic Instrument (DESI Eisenstein V DESI Collabora¬ 
tion 2015) will push the attainable level of precision even 


further, providing a strong test of the standard ACDM cos¬ 
mological model. 

As statistical uncertainties are reduced, the control of 
potential systematic errors becomes essential to derive ro¬ 
bust cosmological constraints. Besides the correct treat¬ 
ment of the observations and accurate models of the data, 
precision cosmology requires a thorough control of the as¬ 
sumptions made when establishing the link between theory 
and observations. For example, most analyses of clustering 
statistics assume a Gaussian likelihood function. This as¬ 
sumption must be carefully revised as they might introduce 
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systematic biases on the obtained confidence levels (Kalus 
et al.|2015 ). 

Even for the Gaussian case, the evaluation of the like¬ 
lihood function requires the knowledge of the precision ma¬ 
trix, \Eq that is, the inverse of the covariance matrix of the 
measurements. In most analyses of clustering measurements, 
the precision matrix is estimated from an ensemble of mock 
catalogues reproducing the selection function of each sur¬ 
vey (e.g. Manera et al. 2013, 2014). However, all estimates 
of \l/ based on a finite number of mock catalogues are af¬ 
fected by noise. A rigorous statistical analysis requires the 
propagation of these uncertainties into the final cosmological 
parameter constraints. 

Recent studies have provided a clear description of the 
dependence of the noise in the estimated precision matrix on 


the number of mock catalogues used (Taylor, Joachimi, V 
Kitching|2013), its propagation to the derived parameter un¬ 


certainties (Dodelson & Schneider 2013; Taylor & Joachimi 
2014) and the correct way to include this additional un¬ 


certainty in the obtained cosmological constraints (Percival 
et al.||2014 ). The results from these studies show that, de¬ 
pending on the number of bins of a given measurement, a 
large number of mock catalogues might be necessary in order 
to keep this additional source of uncertainty under control. 
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2 Paz & Sanchez 


For future large-volume surveys such as Euclid or DESI, this 
requirement might be infeasible, even if these are based on 


approximated methods such as PINOCCHIO (Monaco et al. 
2002|, COLA (|Tassev et al.|20l3| |Koda et aL|2015 ), pat chy 


^ Kitaura et al.||2014 ) or EZMOCKS (Chuang et al. 2015) in¬ 

stead of full iV-body simulations. 

In this paper we test the implementation of the Covari¬ 


ance Tapering (CT) method (Kaufman, Schervish, & Ny- 
chka : 2008) as a tool to minimize the impact of the noise in 


precision matrix estimates derived from a finite set of mock 
catalogues. Although this technique was originally designed 
as a way to speed-up the calculation of maximum likelihood 
estimates, we show that this method can also be used to re¬ 
duce the noise in the estimates of the precision matrix. The 
covariance tapering approach can help to obtain parameter 
constraints that are close to ideal (i.e. those derived when 
the true covariance matrix is known) even when the preci¬ 
sion matrix is estimated from a manageable number of real¬ 
izations. Our results show that CT can significantly reduce 
the number of mock catalogues required for the analysis of 
future surveys, allowing these data to reach their full con¬ 
straining power. 

The structure of the paper is as follows. In Sec. [2] we 
summarize the results of previous works regarding the im¬ 
pact of the noise in precision matrix estimates on cosmolog¬ 
ical constraints. The CT technique is described in Sec. lin 
Sec.|4| we apply CT to the same test case studied by |PercivaI| 


et al. 


(2014), that of normally-distributed independent mea¬ 


surements of zero mean. We then extend this analysis to a 
case with non-zero intrinsic covariances. Section [5] presents 
a study of the applicability of CT to the measurement of 
the baryon acoustic oscillations (BAO) signal using Monte 
Carlo realizations of the of the large-scale two-point cor¬ 
relation function. Finally, in Sect. [6] we present our main 
conclusions. 


2 IMPACT OF PRECISION MATRIX ERRORS 
ON COSMOLOGICAL CONSTRAINTS 


In most cosmological analyses the information from observa¬ 
tions is compressed into a measurement D, such as the power 
spectrum or the correlation function. It is commonly as¬ 
sumed that this measurement is drawn from a multi-variate 
Gaussian distribution with a given mean (D) and covariance 
matrix, C. Following a Bayesian framework, the measure¬ 
ment D can be used to constrain a set of cosmological pa¬ 
rameters 6 by means of an unbiased model T(0) = (D) (0). 
In such way, the probability that the data vector D corre¬ 
sponds to a realization of the model T(0) is given by the 
likelihood function 


£(D|0,^) oc |T^| x / 2 exp 


-i X 2 (D ,0,¥) 


where % 2 is a quadratic form 


x 2 = E - Ti (*)) - T i (*)) - 

ij 


(1) 

( 2 ) 


and \l/ corresponds to the inverse of the covariance matrix 
C, known as the precision matrix. 

The evaluation of the likelihood function requires the 
knowledge of the precision matrix, which is commonly de¬ 


rived from a set of N s independent synthetic measurements, 
D fc , based on mock catalogues matching the properties of 
the real data. The covariance matrix of the sample can be 
inferred using the unbiased estimator 


Cij = 


= - ^)p‘ -■Di 


(3) 


where Di — -jj- D\ is the mean value of the measure¬ 
ments at the z-th bin, over the set of mock catalogues, which 
provides an unbiased estimate of the ensemble average (D). 
When independent realizations are used, the statistics of 
the uncertainties in the covariance and precision matrices 
are governed by the Wishart and inverse-Whishart distribu¬ 
tions ( Wishart|1928 ), respectively. As the inverse-Whishart 
distribution is asymmetric, the inverse of C given by equa¬ 
tion <© provides a biased estimate of the precision matrix. 
However, it is possible to correct for this bias simply by in¬ 


cluding a prefactor as (Kaufman 1967 Hartlap, Simon, & 
Schneider|2007 ) 

iVb + 1 


^ = 1 - 


N B - 1 


(4) 


where AT corresponds to the number of bins in the measure¬ 
ment D. 

As this approach is based on a finite number of realiza¬ 
tions, the estimator in equation @ will be affected by noise 
( Taylor et al.|[2Q13 ), whose effect must be propagated into 
the obtained constraints on the target parameters. The ac¬ 
curacy of the obtained confidence levels on the parameters 
0 and their respective covariances is then ultimately lim¬ 
ite d by the uncertainties in the estimated precision matrix 
4* ( Taylor &; Joachimi|2014 ). 


Dodelson & Schneider (2013) performed a detailed anal¬ 


ysis of the impact of the uncertainties in T. They showed 
that, up to second order in the covariance errors, this ad¬ 
ditional uncertainty can be described by a rescaling of the 
parameter covariances {AOiAOj) by a factor 


f = l + B(N h -N p ), 


(5) 


where N p corresponds to the number of parameters mea¬ 
sured and 


B = 


(N b - AT - 2) 


(N e - N b - l)(N a - N b - 4)' (6) 

However, as shown by Percival et al. (2014), the correc¬ 
tion factor of equation © cannot be directly applied to the 
errors derived from maximum likelihood estimates (MLE) 
based on a given data set. The error in the precision matrix 
introduces a bias in the recovered parameter uncertainties, 
which then deviate from those of the ideal case in which 
the true covariance matrix is known. To take this fact into 
account, the parameter covariances recovered from the mea¬ 
surements D must be rescaled by a factor 

l + B(N h -N p ) 


9 = 


1 + A + H(AT + 1) ’ 


where B is given by equation ([6]) and 

A= 2 


(7) 


( 8 ) 


(N s -N b - 1){N S - JV b - 4)' 

Taylor fe JoachimT| ( |2014| derived general formulae for 


the full propagation of the noise due to the finite sampling 
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Figure 1. The behaviour of the family of tapering functions used 
in this work (the Wendland 2.0 function class) corresponding to 
different values of the tapering parameter T p . The abscissas rep¬ 
resent the distance between two measurement locations (denoted 
by r*i) in data set space, which are shown in units of data ordinals 
(i.e bin units, r% = i). Larger values of T p result in functions with 
a larger support interval. 


of the data covariance matrix into the parameter covari¬ 
ance estimated from the likelihood width and peak scat¬ 
ter estimators, which do not coincide unless the data co- 
variance is exactly known. Their results are in good agree¬ 
ment with the second-order approximations of |Dodelson & 
Schneider (2013) and Percival et al. (2014) in the regime of 


N s 


Nb N p , but show deviations from these results for 


smaller number of simulations. 

In the large N s limit, the correction factor of equa¬ 
tion 0 can be used to obtain constraints that correctly 
account for the additional uncertainty due to the noise in 
the precision matrix estimate of equation Q. However, this 
additional uncertainty in the parameter covariance matrix, 
hinders the constraining power of the data. Given the num¬ 
ber of bins in a measurement D and the number of pa¬ 
rameters that one wishes to explore, the correction factor of 
equation 0 can be used to estimate the number of synthetic 
measurements required to reach a given target accuracy in 
the derived constraints. If the number of bins in a given 
measurement is large, as could be the case for anisotropic 
or tomographic clustering measurements, the required num¬ 
ber of mock realizations to keep the additional uncertainty 
under control might become infeasible. The aim of our anal¬ 
ysis is to test a new technique to reduce the impact of the 
uncertainties in IF on the final parameter constraints, which 
could help to significantly relieve these requirements. 


3 COVARIANCE TAPERING FOR 

LIKELIHOOD-BASED ESTIMATION 

In this section we describe the covariance tapering technique 


developed by Kaufman et al. (2008) to improve the efficiency 


on the computation of MLE. This method was first applied 


nal idea behind CT is the fact that, in many applications, 
the correlation between data pairs far apart is negligible and 
little information is lost by treating these points as being in¬ 
dependent. In this case, setting their corresponding elements 
in the covariance matrix to be exactly zero makes it possi¬ 
ble to take advantage of fast numerical methods for dealing 
with sparse matrices, leading to a significant speed up of the 
evaluation of the likelihood function. 

However, in this work we focus on a different use of CT. 
The off-diagonal elements of the covariance matrix of a gen¬ 
eral measurement might exhibit a wide range of values and 
uncertainties. Even when these elements are non-negligible, 
their relevance to obtain an accurate description of the like¬ 
lihood function must be assessed in terms of their associated 
errors. In this way, by down-weighting the contribution of 
these points (which typically have a low signal to noise ratio) 
to the estimated precision matrix it is possible to avoid the 
propagation of errors into the final cosmological constraints. 


trix. 


Kaufman et al. (2008) define a tapered covariance ma- 
C 1 , in terms of the estimate C of equation (B| and a 


properly defined tapering matrix T as 
C 1 = CoT, 


( 9 ) 


where o indicates the Hadamard product (i.e. the entry- 
wise product). The Schur product theorem guaranties that 
if C and T are positive definite matrices, then so is C 1 . 
The tapering matrix T is defined as an isotropic covariance 
matrix by means of a taper function K as 


Tij — K(||ri Tj 


( 10 ) 


where ri is the i-th measurement location on the data space 
(e.g. the bin separation for correlation function measure¬ 
ments). The function K could in principle be any positive 
compact-support function intended to nullify the covariance 
matrix entries far away from the diagonal. However only 
certain types of functions (the Matern class) ensure the 
asymptotic convergence of the method to the desired maxi¬ 
mum likelihood estimate (see theorems 1 and 3 in |Kaufman 


et al. 2008). Following Kaufman et al. (2008) we use the 


monoparametric family of functions defined by Wendland 
{1995 19981 


K(x) = 


-It 


1 - — 
1 Tj 


L) 4 ( 4 t + 1 ) ifx<T * 

if X > Tr, 


( 11 ) 


Fig. [I] shows the behaviour of these functions for different 
values of the tapering parameter, T p , which defines the size 
of the function support (i.e. the interval where K takes non¬ 
zero values, for more details see Kaufman et al.|2008 ). 

As shown by Kaufman et al. (2008), the estimation of 


the precision matrix simply as the inverse of the tapered 
covariance C 1 can introduce systematic biases on the ob¬ 
tained parameter constraints. However, by applying a sec¬ 
ond Hadamard product to the inverse of and estimating 
the precision matrix as 


'F = 1 - 


A b + 1 
W 


?i) ( e ° T r‘ 


( 12 ) 


to clustering measurements in Paz et al. (2013). The origi- 


it is possible to obtain an unbiased and robust estimate of 
the precision matrix over a large range of T p values. 

The likelihood is then estimated by replacing \F in equa¬ 
tion^ by the two-tapered precision matrix estimator,^ 1 . In 
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Figure 2. Variance of 10 5 MLE of the mean of sets of N b = 15 
uncorrelated normal random variables, as a function of the num¬ 
ber of independent data realizations used on estimation of the pre¬ 
cision matrix. Open symbols correspond to the variance obtained 
when the standard method is used (i.e. the precision matrix is 
approximated by \E r ). The dashed line shows the results obtained 
when applying the CT technique (i.e. the precision matrix is ap¬ 
proximated by SEd) with a tapering parameter of 10 bins. The 
solid line corresponds to the analytic formulae given in Dodelson 
|&; Sch neider (2013). The dotted line correspond to the expected 
variance for the mean of uncorrelated Gaussian random variables 
with first and second moment equal to zero and one, respectively. 


the following sections we will show that the CT method is 
not only useful to approximate the MLE in a more computa¬ 
tionally efficient way, as shown in Kaufman et al. (2008). As 
we will see, CT is also an appropriate technique to signifi¬ 
cantly reduce the impact of the noise in the precision matrix 
estimated from a set of mock measurements, increasing in 
this way the precision of the obtained likelihood confidence 
regions. 


4 COVARIANCE TAPERING IN PRACTICE 

4.1 Testing CT on independent 

normal-distributed measurements 


In this section we compare the performance of the CT ap¬ 
proach with the standard MLE method when applied to 
the case of Nb independent normal-distributed random vari¬ 
ables, in a similar analysis to those performed by Dodelson 


&; Schneider (2013); Taylor et al. (2013) and Percival et al. 


(2014). This simple test is able to illustrate how the CT 
technique can be used to minimize the effects of covariance 
errors on the estimation of likelihood confidence intervals. 

Our data set for this test consists of Nb Gaussian num¬ 
bers with null mean and standard deviation equal to unity, 
in which case the covariance matrix is given by the Nb x Nb 
identity matrix. However, we perform a MLE of the sample 
mean, /x, without assuming any data independence. We first 
compute C from a set of N s independent Monte-Carlo real¬ 
izations of Nb independent Gaussian numbers Df. This esti¬ 


mate can be used to obtain 4/ and 4> using eqs.(|3j) and (12). 
Both of these estimates will include “apparent” correlations 
between different bins, due to the noise in the off-diagonal 
elements. We generate an additional independent set of Nb 
Gaussian numbers to be used as the data set for the target 
parameter estimation. The estimation of the sample mean 
is achieved by maximizing the likelihood function in equa¬ 
tion 0- In this case the model is quite simple, a constant 
function T*(/x) = /x. This procedure is repeated 10 5 times, 
obtaining an estimation for the target parameter on each 
time. By using the set of all the estimated values of /x we 
are able to compute the standard error a achieved by the 
MLE method. The use of an independent data set on pa¬ 
rameter estimation, employing the first N s samples for the 
estimation of C, gives an unbiased set of target parameter 
estimations (Percival et al. 2014). 

The results of this test are shown in Fig. [2] where the 
open points correspond to the parameter variance obtained 
when the standard technique is used, i.e. the precision ma¬ 
trix is approximated by \&, for the case of Nb = 15. The 
comparison of these results with the dotted line, which corre¬ 
sponds to the true expected variance of the mean of a Gaus¬ 
sian random variable, illustrates the effect of the noise in the 
covariance matrix. As can bee seen, the variance decreases 
as N s increases, which is expected due to the corresponding 
improvement on the C estimations. This behaviour is well 
described by the formulae given in |Dodelson & Schneider| 
(2013), as can bee seen by looking at the dashed line in Fig. 
[2] The same behaviour has been seen over a wide range of 
Nb- The agreement found here is consistent with the results 
of Taylor & Joachimi (2014), given that the cases considered 
here correspond to the regime of large N s compared to the 
number of parameters and data bins used. The solid line cor¬ 
responds to the variance obtained when the precision matrix 
is estimated using of equation (12) with a tapering scale 
of 10 bins. The application of the CT method significantly 
reduces the impact of the noise in the variance of the target 
parameter, leading to results that are much closer to those 
of the ideal case. 


4.2 Testing the CT method for realistic 
covariances 

In the previous section we tested the results obtained by 
applying CT in an ideal case in which the true covariance 
matrix is diagonal. In this section we extend this test by 
considering the case in which the different elements of the 
dataset have non-negligible correlations. To this end, we gen¬ 
erate Monte Carlo realizations of Nb Gaussian random vari¬ 
ables with zero mean, correlated following a realistic model 
of the covariance matrix of the two-point correlation func¬ 


tion (Sanchez et al. 2008). The upper triangular of the top- 
left panel of Fig.j3]shows the model for the covariance matrix 
for the case of Nb = 50, normalized as indicated in the fig¬ 
ure key (that is, the correlation matrix). For comparison, the 
lower triangular part of the same panel shows the estimate 
C obtained using 300 independent Monte Carlo realizations. 
As can be seen, the main features of the model covariance 
matrix are recovered. However the presence of noise is clear 
in the off-diagonal elements of the matrix, corresponding 
to covariances of measurements at large separations. The 
signal-to-noise ratio of the covariance matrix is smaller for 
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Correlation Matrix [normalized] 

0.0 0.2 0.4 0.6 0.8 1.0 


Precision Matrix [normalized] 
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0 IxlO 7 2x10 7 3x10 7 4x10 7 

Std. dev. of Precision estimates 


i i 

- 0.2 - 0.1 0.0 0.1 0.2 

Precision Matrix [normalized] 


Figure 3. Top left panel: Comparison between the model for the covariance matrix used in this work (upper triangular part) and 
the standard estimation using 300 realizations. Top right panel: The true precision matrix \I/ (the matrix inverse of the model, upper 
triangular part) compared to the usual estimator Bottom right panel: As in the above panel, but comparing this time Sfr with the CT 
estimator using T p = 230h -1 Mpc. Bottom left panel: Standard deviations for 10 4 precision matrix estimates, with (lower triangular 
part) and without CT (upper triangular part), around the true matrix T r . 


the off-diagonal elements. The main idea of this work is to 
control the propagation of these errors into the precision 
matrix, restricting in this way their impact on the likeli¬ 
hood function and the obtained confidence intervals of the 
target parameters. 

The top right panel of Fig. [ 3 ] shows the precision ma¬ 
trices corresponding to the model for the covariance ma¬ 
trix (upper triangular part) and the one obtained using the 
standard estimator of equation 0. normalized in analogous 


manner to the covariance matrices. As can be seen, the pres¬ 
ence of noise in the off-diagonal elements is even more re¬ 
markable in the case of the precision matrix. The increment 
on the noise is due to the propagation of errors during the 
matrix inversion operation. The bottom-right panel of Fig. 
[ 3 ] shows a comparison of the precision matrix obtained by 
applying CT with a tapering parameter T p = 230/i -1 Mpc 
(lower triangular part) and the model precision matrix (up¬ 
per triangular part, identical to the one shown in the upper 
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Figure 4. Error of the mean of N b = 30 random Gaussian num¬ 
bers with zero mean and covariance given by the model of |Sanchez| 
|et a l. ([2008}, normalized to the ideal error cr^eal predicted by the 
model. The dashed line corresponds to the error obtained using 
the standard method, where the 4/ estimator is taken as a proxy 
of the ideal precision matrix. The solid line shows the results cor¬ 
responding to the CT technique, where the precision matrix is 
estimated by IE© The dotted line corresponds to the ideal error. 


part of the top-right panel). The application of CT leads 
to a significant suppression of the noise in the off-diagonal 
elements of the precision matrix. 

The improvement in the accuracy obtained by applying 
CT can be quantified by computing the deviations of the 
different estimators from the true model precision matrix. 
The lower left panel of Fig. [3] shows the standard deviations, 
element by element, of the estimators IF and SF* for N s = 
300, with respect to the ideal precision matrix. The upper 
triangular part corresponds to the deviations obtained using 
the standard method (which are well approximated by the 
analytic formulae given in [Taylor et al.|[2013 ), whereas the 
lower triangular part shows the results obtained using the 
CT technique. The standard deviations obtained when CT 
is applied are smaller than those achieved by the standard 
technique, indicating a better performance at recovering the 
correct underlying precision matrix. 

The success of the covariance tapering method depends 
on the selection of an adequate tapering scale T p . In the case 
analysed in this section, a natural way to characterize the 
different approaches is through the recovered error in the 
mean fi. The ideal error, which represent the true constrain¬ 
ing power of a given data set, can be easily computed by 
taking the inverse of the single element of the fisher matrix. 
As the derivative of the model with respect to the target pa¬ 
rameter is 1 for each of the N b dimensions, the ideal error of 
the mean is simply given by crideai = 1/ T ij • Fig.Elshows 
a comparison between the standard and CT methoos for a 
fixed number of bins and simulations (TVb = 30, N s = 100). 
For each method we show the ratio cr M /crideai • The solid line 
corresponds to the results obtained by applying CT as a 
function of the tapering scale T p . The dashed line shows to 
the error obtained using the standard method, where the 


precision matrix is estimated by SF. The standard deviation 
obtained in this case corresponds to an excess of 45% with 
respect to that of the ideal case, indicated by the dotted 
line. As can be seen, there is an optimal tapering scale of 
T p ~ 230 /i _1 Mpc), for which the error obtained is only 10% 
larger than the ideal value. For large tapering scales, the CT 
method recovers the same results as the standard technique, 
whereas for small T p the errors become larger again. This 
behaviour might be given by the relation between the ta¬ 
pering scale and the structure of the off-diagonal elements 
of the covariance matrix. For large values of T p , the tapering 
procedure damps the contribution of the most off-diagonal 
elements of the covariance matrix, whose intrinsic values are 
small and are dominated by noise. As the tapering scale is 
reduced, the damping of the noise is more efficient and the 
results become more similar to those of the ideal case. How¬ 
ever, if the tapering scale is too small, this procedure might 
affect entries of the covariance matrix whose intrinsic values 
are not small. As these off-diagonal elements are affected, 
the obtained results deviate again from those of the ideal 
case. 

In Fig. [5] we show the results of extending this test to 
a wide range of N s values. The variance of the MLE of the 
mean of 10 5 independent data sets of N d — 30 correlated 
Gaussian numbers is shown as a function of the number 
of realizations employed in the estimation of C. The points 
correspond to the variance inferred using the standard tech¬ 
nique, whereas the solid coloured lines indicate the results 
obtained by applying CT with different T p values, as indi¬ 
cated in the figure key. The dashed line corresponds to the 
expected ideal variance of the mean, crf deal . As can bee seen 
in the left panel of this figure, the variance recovered by ap¬ 
plying CT with large tapering scales is very similar to the re¬ 
sult of the standard method. However, as the tapering scale 
decreases, the variance also becomes smaller, reaching values 
close to those of the ideal case for T p ~ 230/i _1 Mpc. The 
lower panels in Fig. [5] show the behaviour of the estimated 
values of the target parameter for the standard (points) and 
CT (solid lines) methods. In all cases the recovered values 
of 11 show no indication of a systematic deviation from the 
true value fi — 0, with only small fluctuations for different 
values of N s . 

The right panel of Fig. [5] shows the results obtained by 
applying the CT technique for values of the tapering scale of 
less than 230 /i _1 Mpc, which result in larger variances of the 
target parameter. However, it is worth noticing that even in 
this case there is no bias in the obtained constraints. These 
results suggest that the optimal tapering scale depends only 
on the shape of the underlying covariance matrix, rather 
than the number of independent data samples used to com¬ 
pute C. 


5 APPLICATION TO BARYON ACOUSTIC 
OSCILLATION MEASUREMENT 


In this section we analyse the applicability of the covariance 
tapering method, in the context of baryon acoustic oscilla¬ 
tions (BAO) measurements. For this test we use the model of 
the full shape of the large-scale two-point correlation func¬ 
tion, £(s), of Sanchez et al. (2013 2014), which is based 


on renormalized perturbation theory (Crocce & Scoccimarro 
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Figure 5. Variance of 10 5 MLE of the mean of Gaussian numbers, following a realistic model of the covariance. The results are 
presented as a function of the number of samples used in the covariance estimation. Open symbols correspond to the variance obtained 
when the precision matrix is estimated through the standard method (i.e. 'F —» \F). Solid coloured lines show the results obtained by 
CT using different T p parameter values, as indicated in the key. The dashed line correspond to the expected variance for the mean, that 
is the inverse of the fisher element. Left panel corresponds to T p values between 230 and 3330h -1 Mpc, whereas right panel shows the 
results of T p ranging from 66 to 230h -1 Mpc. The bottom subpanels at both sides show the recovered /i values for all methods. 


2006). We generate sets of correlated Gaussian numbers with 


mean given by the model of £(s) and the same covariance 


matrix as in the test of the previous section (Sanchez et al. 


2008). In this way, each Monte Carlo realization mimics a re¬ 


alistic measurement of the correlation function at the scales 
relevant for BAO measurements. 

As in previous sections, we generate sets of N s ran¬ 
dom realizations of the correlation function, sampled using 
iVb = 30 bins, varying N s between 40 and 420. We use these 
synthetic samples to obtain an estimate of the covariance 
matrix. These realizations are analogous to the measure¬ 
ments from mock galaxy catalogues used to compute C in 
most clustering analyses. We generate an additional syn¬ 
thetic measurement to serve as the objective data sample of 
the two-point correlation function, on which we perform fits 
of the BAO signal following the methodology of Anderson] 


et al. (2014). More precisely, we fit the BAO peak position 


using a parametrization for £(s) given by 


•fmod(-s) — b 2 ^temp(ois) + CLo -|-1-(13) 

S S z 

where £tem P (s) is a template given by the model of the corre¬ 
lation function, the parameter b corresponds to a large-scale 
bias factor, the shift parameter a is used to control the po¬ 
sition of the BAO peak, and ao, cli and a 2 are additional 
parameters used to marginalize over the broad-band signal. 
These fits are performed using the standard estimation of 
the precision matrix, SF, and the CT estimate, SF 1 , with 
varying tapering scales. We explore the parameter space 
6 = (a, 6, ao, ai, 02 ) using the Markov chain Monte Carlo 
(MCMC) technique. We focus here on the constraints on a , 
marginalizing over the remaining parameters. 


The procedure described above is repeated 2 x 10 5 times 
to obtain a smooth measurement of the dispersion of the a 
values obtained from different realizations. Fig. [6] shows the 
behaviour of this dispersion as a function of the number of 
mock samples used in the estimation of the covariance ma¬ 
trix. The points correspond to the results obtained when 
the precision matrix is approximated by the estimator SF of 
equation As expected, the error in a decreases as N s in¬ 
creases, approaching the ideal error obtained when the true 
covariance matrix is used to evaluate the likelihood func¬ 
tion, shown by the black solid line. The results from our 
Monte Carlo realizations are well described by the analytic 
formulae given by Dodelson & Schneider (2013), which is 
shown by the blue short-dashed line. The red long-dashed 
line corresponds to the result of rescaling the mean variance 
on a recovered from each MCMC by the correction factor 
of equation 0 which, as shown by jPercival et ah] (| 2014| ), 
correctly accounts for the additional error due to the noise 
in C. 


The thin solid lines in Fig. [6] correspond to the results 
obtained using the CT method, colour-coded according to 
the corresponding tapering scale. For large tapering scales 
the results closely resemble those of the standard technique. 
As the tapering parameter decreases, approaching the opti¬ 
mal scale found in the previous section of 230/i _1 Mpc, the 
dispersion of the BAO scale estimates becomes closer to the 
ideal error. These results show that, for a given value of iV s , 
the CT method leads to measurements of the BAO scale 
that are closer to the true constraining power of the data 
than the standard technique. For N s ~ 400 the uncertainty 
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Figure 6. Standard deviation of the BAO peak scale as a func¬ 
tion of the number of simulations used in the covariance estimate 
(upper panel). Open circles indicate the results corresponding to 
the standard method. Long and short dashed lines corresponds to 
the analytic formulae of |Dodelson & ; Schneider ( 2013]) and |Perch| 
|val et al~| |2014). The black solid line indicates the error obtained 
when the ideal precision matrix is used. Coloured lines (from blue 
to red) show the results obtain by applying CT with different ta¬ 
pering scales. The black dot-dashed line show the CT results when 
a too small tapering parameter is used (T p = 80h -1 Mpc). 


obtained using the CT method essentially recovers that of 
the ideal case. 

The black dot-dashed line in Fig. [6] corresponds to the 
CT results obtained by setting T p = 80 h~ x Mpc, illustrating 
the effect of implementing a too small tapering scale. As we 
found in the previous section, selecting a too small tapering 
scale leads to an increment of the errors in the target pa¬ 
rameters. However, as shown in the lower panel of Fig. [5] 
even in this case the CT results show no systematic bias, 
with negligible differences from the true underlying param¬ 
eter atrue = 1 for the full range of N s values analysed. 

The improvement of the constraints achieved by the CT 
technique with respect to the standard method ultimately 
relays in a closer approximation of the ideal likelihood sur¬ 
face. In Fig. [T] we show the difference between the mean 
marginalized posterior distribution of the shift parameter 
a obtained by applying CT with different tapering scales 
(green and purple lines, as indicated in the figure key) and 
that of the ideal case. The posterior distribution correspond¬ 
ing to the standard method is shown as a solid orange curve. 
In general, the CT results provide a closer approximation of 
the ideal distribution than the standard method, most no¬ 
tably for a tapering scale of 230 /i -1 Mpc. The use of a larger 
tapering scale leads to larger deviations from the ideal likeli¬ 
hood function. However, in all of these cases they are closer 
to the optimal result than the standard method. In con¬ 
trast, using a too small tapering scales (as in the case of 
T p = 50/i _1 Mpc shown in the figure) leads to deviations 
from the underlying likelihood function that can be even 
larger than those obtained with the standard technique. This 



a 

Figure 7. Difference between the mean marginalized posterior 
distribution of the shift parameter a obtained by applying CT 
with different tapering scales (green and purple lines) and that 
of the ideal case (gray line). The orange curve shows the results 
obtained by the standard method. 


highlights the importance of performing a careful analysis of 
the optimal tapering scale for each case, which will depend 
on the structure of the covariance matrix. 


6 SUMMARY AND CONCLUSIONS 


In this work we have implemented and tested a novel tech¬ 
nique for the estimation of the precision matrix, the covari¬ 
ance tapering method, developed by Kaufman et al. (2008). 


We have analysed the performance of the CT method and 
the standard technique by comparing the obtained parame¬ 
ter constraints with those found in the ideal case where the 
true precision matrix is known. For the latter case, the size 
of the errors in the estimated parameters is only governed 
by the constraining power of the data sample. Therefore, 
any excess seen in the recovered errors in comparison with 
that of the ideal case can be associated with the noise in the 
precision matrix estimation. 

We first carried out an analysis similar to those per¬ 


formed by Dodelson &; Schneider (2013), Taylor et al. (2013) 
and Percival et al. (2014), using the standard deviation of 


the maximum likelihood estimation of the mean of Nb in¬ 
dependent normal random variables as a test case. Our re¬ 
sults are in agreement with previous works, indicating that 
the noise in the covariance matrix estimation has a signifi¬ 
cant impact on the uncertainty obtained on parameter cons¬ 
traints. We found that the size of the synthetic data set used 
to estimate the precision matrix is crucial to control the im¬ 
pact of the covariance uncertainties on the final parameter 
constraints. Our results also show that CT helps to reduce 
the error in the precision matrix estimation, leading to un¬ 
certainties in the target parameters that are closer to the 
ideal case than those obtained with the standard method. 

The efficiency of the CT technique for a data set com¬ 
posed of independent normal variables, where the true co- 
variance matrix is diagonal, can be expected. In order to 
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check the validity of the CT method in a more realistic sit¬ 
uation, we also studied the case of the MLE of the mean 
of Gaussian numbers with non-zero correlations given by 
a realistic model of the covariance matrix of the two-point 
correlation function ( Sanchez et al.|2008 ). Our results show 
that also in this case the covariance tapering method leads 
to smaller errors than the standard technique, without in¬ 
troducing any systematic bias in the estimated parameters. 
In this case we found an optimal tapering scale, defined as 
the value of the tapering parameter for which the obtained 
standard deviation is closer to its ideal value. For example, 
in the case of 30 normal distributed values and 100 synthetic 
samples on the covariance estimation, the CT method with 
the optimal tapering scale, gives a standard deviation only 
10% larger than that of the ideal case, whereas the standard 
technique results in errors 45% larger. We showed that the 
optimal tapering parameter depends only on the structure 
of the underlying covariance matrix and is insensitive to the 
bin size or the number of synthetic samples used in the esti¬ 
mation of the precision matrix. Smaller tapering parameters 
than the optimal value result in an increase of the standard 
deviation, although no bias is introduced. 

Finally, we performed an analysis of the CT technique 
on a more realistic context by testing its applicability to 
isotropic BAO measurements. We used an accurate model 


of the large-scale two-point correlation function (Sanchez 


et al. 2013) and its full covariance matrix (Sanchez et al. 


2008) to generate Monte Carlo realizations of this quantity. 


We used these synthetic measurements to perform fits of the 
BAO signal following the methodology of |Anderson et ah] 
(2014). We found that the CT method significantly reduces 
the impact of the noise in the precision matrix on the ob¬ 
tained errors in the BAO peak position without introducing 
any systematic bias. As in the previous case, the optimal 
tapering parameter only depends on the shape of the true 
covariance matrix, with a preferred T p scale similar to that 
of the previous test (i.e. the case of zero-mean correlated 
Gaussian numbers). 

Covariance tapering can help to reduce the required 
number of mock catalogs for the analysis of current and 
future galaxy surveys. This can be clearly illustrated by ex¬ 
tending the analysis of section [5] to the case of N s = 600, 
corresponding to the number of mock catalogues used in the 
analysis of the SDSS-DR9 BOSS clustering measurements of 


Anderson et al. (2012). In this case, the uncertainty in the 


BAO shift parameter obtained by applying the CT technique 
is equivalent to that derived with the standard method using 
7V S = 2300 instead. 

As we highlighted before, the performance of the CT 
technique ultimately depends on the structure of the under¬ 
lying covariance matrix. In this work we assumed a Gaussian 
model for the covariance matrix of the correlation function. 
Although this model gives an excellent description of the re¬ 
sults of numerical simulations, more accurate models must 
include also the contribution from modes larger than the 
survey size (de Putter et al. 2012) or non-Gaussian terms 
( Scoccimarro et al.||1999 ) that would affect the off-diagonal 
elements of the covariance matrix. We leave the study of the 
performance of the CT under the presence of these contribu¬ 
tions for future work, as well as the extension of the present 
analysis to alternative data sets such as the power spectrum 
or anisotropic clustering measurements in general. 


The covariance tapering technique can be extremely 
useful for the analysis of future surveys such as Euclid and 
DESI. The small statistical uncertainties associated with 
these data sets will provide strong tests for the standard 
ACDM cosmological model. However the large number of 
mock catalogues that are required by the standard technique 
to maintain the accuracy level of the cosmological cons¬ 
traints might be infeasible. The application of the covari¬ 
ance tapering technique can significantly reduce the number 
of mock catalogues required for the analysis these surveys, 
allowing them to reach their full constraining power. 
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